FWIII: Characteristics in Local Populations

Susan Vanderplas & Muxin Hua

2023-11-06

General Overview

  1. Create a scanner which can collect local population data outdoors ✅ and indoors 🚧

    • Get people to walk across the scanner 😬

    • Collect large amounts of shoe pictures w/ time, date, location data 🥿👠👞

  2. Create an algorithm which can identify human-recognizable characteristics from scanner images

  3. Generate a local database of common footwear features

  4. Use local database to characterize frequency of similar shoes in local population

Tasks

In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they’d have the problem solved by the end of the summer. Half a century later, we’re still working on it.

Our Assumption in 2018

African Elephant

Asian Elephant

If models can differentiate between types of elephants, they can identify shapes… right?

Circles

Quads

???

Types of Computer Vision Tasks

We can reasonably pose this problem in 3 different ways:

Classification: label same-size regions with one or more classes

Object detection: propose a bounding box and label for each object in an image

Image segmentation: find regions of the image and label each region

Each method requires a different type of labeling schema, data format, etc.

Some label types are more tedious to generate than others.
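To make the difference in labeling effort concrete, here is a rough sketch of how many labeled items one image might need under each formulation. The function name, the tile size default, and the pixel-level count for segmentation are illustrative assumptions, not the project's actual schema; real segmentation tools usually work with polygons rather than individual pixels.

```python
def annotation_units(task, width, height, n_objects, tile=256):
    """Rough count of labeled items needed for one image (illustrative only)."""
    if task == "classification":
        # One set of class labels per same-size tile.
        return (width // tile) * (height // tile)
    if task == "detection":
        # One bounding box plus label per object.
        return n_objects
    if task == "segmentation":
        # One class per pixel in the worst case.
        return width * height
    raise ValueError(f"unknown task: {task}")


# Hypothetical 512x512 image containing two objects.
for task in ("classification", "detection", "segmentation"):
    print(task, annotation_units(task, 512, 512, n_objects=2))
```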

Initial Approach (~2019)

  • Use VGG16 to classify 256x256 px chunks of images

  • Goal is to label the entire chunk with one or more classes

VGG16 Shoe Example approach

  • Hard to integrate predictions from each chunk back into the main image reliably
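A minimal sketch of the chunking step, assuming non-overlapping tiles and dropping partial tiles at the edges (padding would be another option). The function names are hypothetical, and the naive union at the end shows why recombination is lossy: it discards where in the image each label came from.

```python
def tile_boxes(width, height, tile=256):
    """Return (left, top, right, bottom) boxes for full tiles in the image."""
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


def merge_tile_labels(tile_predictions):
    """Naive recombination: union of the labels predicted for every tile."""
    merged = set()
    for labels in tile_predictions.values():
        merged |= set(labels)
    return sorted(merged)


# Hypothetical per-tile predictions for a 600x300 image.
boxes = tile_boxes(600, 300)
predictions = {boxes[0]: ["circle"], boxes[1]: ["circle", "quad"]}
print(boxes)
print(merge_tile_labels(predictions))
```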

Next Approach (2021-2023)

  • Try to use object detection

    • Started with FastAI, but its ongoing support was terrible

    • Moved to PyTorch, which is a lower level framework
      (Side note: Muxin is amazing at Python)

  • New developments:

    • Structured the model so that the underlying network is replaceable: can swap VGG16 for ResNet50

    • Implementing better metrics, e.g. Intersection over Union for assessing predictions vs. ground truth
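For reference, Intersection over Union is the area where a predicted box and a ground-truth box overlap, divided by the area covered by either box. The sketch below assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); it is a generic illustration, not the project's implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union for two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction vs. ground truth.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

A value of 1 means the boxes coincide and 0 means they do not overlap at all.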

Fundamental Problem

  • Pretrained neural networks are trained on millions of human-annotated real-world photos

  • Even shoe soles are artificial relative to a natural scene

  • Networks weren’t trained on the artificial patterns or layouts that are used to design shoe soles

Pretrained NNs can still generate useful information that is computer-friendly (e.g. Charless’s method)

To work with artificial patterns and get human-like labels, we have to do something different.

Solution?

  1. Systematically generate a large library of synthetic data

    • pre-labeled

    • complex characteristics

    • will require several iterations to get right

    • Use to train preliminary model

  2. Run 2D patterns through Charless’s network to generate more realistic 3D images

    • Use to train a second-gen model

  3. Train on Zappos pictures labeled by humans

    • Update 2nd gen model weights (transfer learning)

  4. Train on Scanner Photos

    • Update 3rd gen model weights (messy data)

Solution?

Measure performance/accuracy changes over time on a consistent set of stimuli manually derived from real shoes
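One way this tracking could look, as a sketch: score every model generation on the same fixed benchmark and keep the history. The Jaccard score on label sets is an assumed metric chosen for illustration, and the benchmark entries and stand-in "models" are hypothetical.

```python
def jaccard(predicted, truth):
    """Overlap between two label sets: shared labels / all labels."""
    predicted, truth = set(predicted), set(truth)
    if not predicted and not truth:
        return 1.0  # both empty: treat as perfect agreement
    return len(predicted & truth) / len(predicted | truth)


def score_generation(predict, benchmark):
    """Mean Jaccard score of one model over the fixed benchmark."""
    scores = [jaccard(predict(image_id), truth) for image_id, truth in benchmark]
    return sum(scores) / len(scores)


# Hypothetical benchmark: (image id, labels manually derived from a real shoe).
BENCHMARK = [("shoe_01", {"circle", "quad"}), ("shoe_02", {"star"})]


# Stand-ins for trained models; a real model would take the image itself.
def gen1_predict(image_id):
    return {"circle"}


def gen2_predict(image_id):
    lookup = {"shoe_01": {"circle", "quad"}, "shoe_02": {"star", "circle"}}
    return lookup[image_id]


history = {
    name: score_generation(fn, BENCHMARK)
    for name, fn in [("gen1", gen1_predict), ("gen2", gen2_predict)]
}
print(history)
```

Because the benchmark never changes, differences in the scores reflect the model generation rather than the test data.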

Synthetic Pattern Generation

Synthetic Data Generation

Region Layout

Synthetic Data Generation

Patterns

Snowflake

Hexagon Open Circle

Stud

Target Solid Circle

Circle Bar Across

Solid Circle Array

Targets with Arcs

6-pointed star
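As an illustration of how one of these patterns might be generated programmatically, the sketch below builds a "solid circle array" as an SVG string with Python's standard library. The function name, default sizes, `class` attribute, and the plain-text label in `<metadata>` are all assumptions made for this example, not the format the project uses.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def solid_circle_array(rows, cols, radius=8, spacing=24):
    """Return an SVG string containing a rows x cols grid of filled circles."""
    ET.register_namespace("", SVG_NS)
    width, height = cols * spacing, rows * spacing
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": str(width), "height": str(height),
         "viewBox": f"0 0 {width} {height}"},
    )
    # Store the pattern label inside the file (hypothetical label format).
    meta = ET.SubElement(svg, f"{{{SVG_NS}}}metadata")
    meta.text = "pattern=solid_circle_array"
    for r in range(rows):
        for c in range(cols):
            ET.SubElement(
                svg,
                f"{{{SVG_NS}}}circle",
                {"cx": str(c * spacing + spacing // 2),
                 "cy": str(r * spacing + spacing // 2),
                 "r": str(radius),
                 "class": "solid_circle"},
            )
    return ET.tostring(svg, encoding="unicode")


svg_text = solid_circle_array(rows=2, cols=3)
print(svg_text)
```

Since the label travels with the drawing, every generated image arrives pre-labeled.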

Synthetic Data Generation

Outlines

Synthetic Data Generation

Advantages

  • SVGs can include metadata

  • Easy scaling

  • SVG intersection will allow marking partial objects

  • Region segmentation + image labels
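To show what marking partial objects could mean in practice, here is a simplified geometric stand-in: it classifies a circle as fully inside, partially inside, or outside a rectangular region. Real outsole regions are irregular and would need a true SVG path intersection; the rectangle, the function name, and the three category names are assumptions for this sketch.

```python
def classify_circle(cx, cy, r, region):
    """Classify a circle against a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = region
    # Fully inside: the circle's bounding box fits within the region.
    if cx - r >= left and cx + r <= right and cy - r >= top and cy + r <= bottom:
        return "inside"
    # Distance from the circle's center to the nearest point of the rectangle.
    nearest_x = min(max(cx, left), right)
    nearest_y = min(max(cy, top), bottom)
    dist_sq = (cx - nearest_x) ** 2 + (cy - nearest_y) ** 2
    # No overlapping area if that point is at least one radius away.
    if dist_sq >= r ** 2:
        return "outside"
    return "partial"


# Hypothetical 100x100 region with three circles of radius 10.
region = (0, 0, 100, 100)
for center_x in (50, 95, 150):
    print(center_x, classify_circle(center_x, 50, 10, region))
```

Objects flagged as "partial" could then carry a separate label so the model is not penalized for, or misled by, clipped shapes.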

Disadvantages

  • Manual SVG creation (8 h ≈ 52 images)

  • Requires a new R/Python library to generate data

  • 3D rendering after 2D stage:

    • digital via OpenSCAD + SVG?
    • Can apply different surface colors
  • Lots of work required before we start in on photos

Questions?